Abstract

In this paper we present a new approach to control variates for improving compu- 
tational efficiency of Ensemble Monte Carlo. We present the approach using sim- 
ulation of paths of a time-dependent nonlinear stochastic equation. The core idea 
is to extract information at one or more nominal model parameters and use this 
information to gain estimation efficiency at neighboring parameters. This idea is the 
basis of a general strategy, called DataBase Monte Carlo (DBMC), for improving 
efficiency of Monte Carlo. In this paper we describe how this strategy can be imple- 
mented using the variance reduction technique of Control Variates (CV). We show 
that, once an initial setup cost for extracting information is incurred, this approach 
can lead to significant gains in computational efficiency. The initial setup cost is 
justified in projects that require a large number of estimations or in those that are 
to be performed under real-time constraints. 
1 Introduction



The purpose of this paper is to present a novel approach for efficient estimation 
via the Monte Carlo (MC) method. The approach is very broadly applicable 
but here, to present the main ideas, we narrow the focus to Ensemble Monte 
Carlo where estimation is based on stochastically independent trajectories of a 
system. To illustrate, we use simulation of time-dependent nonlinear processes 
for which Monte Carlo is a particularly general and powerful numerical method 
compared to available alternatives. Time-dependent nonlinear processes are 
very general models used, among others, in statistical mechanics [1], data
assimilation in climate, weather and ocean modeling [2], financial modeling
[3], and quantitative biology [4]. Hence developing efficient MC methods may
significantly impact a wide range of applications. 

A known weakness of MC is its slow rate of convergence. Assume $Y$ is a
random quantity defined on paths of a process and let $\sigma_Y$ denote its standard
deviation. The convergence rate of MC for estimating the expected value of
$Y$ is $\sim \sigma_Y / \sqrt{n}$, where $n$ is the number of independent paths of the process.
In general the canonical $n^{-1/2}$ rate of convergence cannot be improved upon;
hence, since the inception of the MC method, a number of variance reduction
(VR) techniques have been devised to reduce $\sigma_Y$ (see [5] for an early account
and [3] and [6] for more recent discussions).

Most VR techniques lead to estimators of the form 

$$W_1 Y_1 + \cdots + W_n Y_n,$$

i.e., a weighted average of the samples. These techniques prescribe (i) a recipe
for selecting samples $Y_1, \ldots, Y_n$ and (ii) a set of weights $W_1, \ldots, W_n$. To arrive at
these prescriptions, one must rely on the existence of specific problem features 
and the ability of the user of the method to discover and effectively exploit 
such features. This lack of generality has significantly limited the applicability 
of VR techniques. 

The point of departure of a new strategy, called DataBase Monte Carlo (DBMC),
is to address this shortcoming and to devise generic VR techniques that can
be applied broadly [7]. All VR techniques bring additional information to
bear on the estimation problem; however, as mentioned above, this information
is problem specific and relies on exploiting special features of the problem
at hand. By contrast, as will be clarified in this paper, DBMC adds a
generic computational exploration phase to the estimation problem that relies
on gathering information at one (or more) nominal model parameter(s)
to achieve estimation efficiency at neighboring parameters. The advantage of
this approach is its generality and wide applicability: it is quite easy to implement
and it can wrap existing ensemble MC codes. On the other hand,
the computational exploration phase of the DBMC approach may require ex- 
tensive simulations and can be computationally costly. Therefore, the initial 
setup cost needs justification. The setup cost may be justified in projects that 
involve estimations at many model parameters and/or in projects where there 
is a real-time computational constraint. In the first type of project, the setup 
cost may lead to efficiency gain for each subsequent estimation, and for a large 
enough number of subsequent estimations it can be easily justified. In projects 
with a real-time constraint the setup cost is an off-line "passive" cost that can 
lead to estimates of significantly higher quality (lower statistical error); the 
higher quality in many such projects more than justifies the setup cost. 

In this paper we limit ourselves to presenting the implementation of the VR
technique of Control Variates (CV) in the DBMC setting (see [7] for discussion
of other VR techniques). The CV technique, which is less utilized in
computational physics than the VR technique of Importance Sampling,
requires identifying a number of random variables called control variates, say
$X_1, \ldots, X_k$, that are correlated with $Y$ and have known means. The correlation
with $Y$ implies that the $X_i$'s carry information about $Y$. The CV technique is a
way of utilizing the information included in the controls (their known means)
to help with the estimation of the mean of variable $Y$. In the DBMC setting we
assume that $Y = Y(\theta)$ depends on a model parameter $\theta$ and use $X_i = Y(\theta_i)$,
where the $\theta_i$'s are in a neighborhood of $\theta$ ($i = 1, \ldots, k$). In a departure from the
classical CV technique, we use "high quality" estimates of $E[X_i]$ rather than
precise values of $E[X_i]$ to arrive at the controlled estimator of $E[Y]$. As we
argue in this paper (and elsewhere [8]) this departure allows for substantially
broader choices of control variates and makes the CV technique significantly
more flexible and effective.

The DBMC method shares a similar intent with the well-known histogram
reweighting method from the Markov chain Monte Carlo literature (e.g.,
[1]), but with a very different setting and implementation, and with broader
applicability. For example, it does not rely on having a Boltzmann distribution
or $\exp(-H/kT)$ structure. Given its generality, it has potential applications,
among others, in ensemble weather prediction, hydrological source location,
climate and ocean modeling, optimal control, and stochastic simulations of
biological systems.

The remainder of the paper is organized as follows. In section 2 we discuss
preliminaries, including the details of the example numerical study (the
time-dependent Ginzburg-Landau (TDGL) equation) as well as the method of
control variates. Estimation of mean outcomes of the TDGL equation over a
range of temperatures is of interest, especially considering the large difference
in behavior below and above the coexistence curve. In section 3 we describe the
DBMC methodology and motivation in a general context. Section 4 discusses
the implementation of DBMC as applied to estimation of quantities
generated by the TDGL equation, and the results of that numerical study. We
conclude in section 5.



2 Preliminaries 



We present aspects of our approach and numerical results in the context of 
the time-dependent Ginzburg-Landau (TDGL) model. It is worth noting that
this model is chosen for illustrative purposes only and we do not make use of 
any of its specific features. 



2.1 Time-Dependent Ginzburg-Landau



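We use a canonical equation of phase-ordering kinetics [9, 10, 11], the stochastic
TDGL equation in two spatial dimensions. This is written as

$$\frac{\partial \phi(\mathbf{x}, t)}{\partial t} = D \Delta \phi(\mathbf{x}, t) - V'(\phi(\mathbf{x}, t)) + \eta(\mathbf{x}, t) \qquad (1)$$

where $\phi(\mathbf{x}, t)$ represents a local order parameter, e.g. a magnetization at point
$\mathbf{x} = (x_1, x_2)^T$ and time $t$ ($^T$ denotes transpose). The noise $\eta$ has mean zero
and covariance $\langle \eta(\mathbf{x}, t)\, \eta(\mathbf{x}', t') \rangle = 2\delta(\mathbf{x} - \mathbf{x}')\delta(t - t')$. We choose a double-well
potential $V(\phi) = -\frac{\theta}{2}\phi^2 + \frac{\chi}{4}\phi^4$. As in [11], $\chi$ is a constant, and $\theta$ is a function
of temperature such that a high $\theta$ corresponds to a low temperature.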

We use a discrete form of (1) using a forward Euler-Maruyama stochastic
integrator and a 5-point stencil for the Laplacian (denoted $\Delta_{\mathbf{x}}$) for simulation:

$$\phi(\mathbf{x}, t + \delta t) = \phi(\mathbf{x}, t) + D\, \delta t\, \Delta_{\mathbf{x}} \phi(\mathbf{x}, t) - \delta t \left[ -\theta \phi(\mathbf{x}, t) + \chi \phi^3(\mathbf{x}, t) \right] + \sqrt{2 (\delta t / \delta x)}\, N(\mathbf{x}, t)$$

with time step $\delta t$ and lattice spacing $\delta x$, and where $N(\mathbf{x}, t)$ are independent
and identically distributed standard normal random variables for each space-time
point $(\mathbf{x}, t)$. What follows applies to other discretization schemes as well.
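As a concrete illustration of the update rule above, the following is a minimal sketch of one discretized path. Periodic boundary conditions, a zero initial field, `D = 1`, the small lattice and step count, and the function names `laplacian_5pt`, `tdgl_step` and `simulate_path` are all assumptions made for the example; none of them is specified in the text. The noise amplitude is taken as $\sqrt{2\,\delta t/\delta x}$, which equals $\sqrt{2\,\delta t}$ for unit lattice spacing.

```python
import numpy as np

def laplacian_5pt(phi, dx):
    """5-point stencil Laplacian; periodic boundaries are an assumption."""
    return (np.roll(phi, 1, axis=0) + np.roll(phi, -1, axis=0)
            + np.roll(phi, 1, axis=1) + np.roll(phi, -1, axis=1)
            - 4.0 * phi) / dx**2

def tdgl_step(phi, noise, theta, chi=1.0, D=1.0, dt=0.01, dx=1.0):
    """One forward Euler-Maruyama step of the discretized TDGL update."""
    # V'(phi) = -theta * phi + chi * phi**3 for the double-well potential.
    drift = D * laplacian_5pt(phi, dx) - (-theta * phi + chi * phi**3)
    return phi + dt * drift + np.sqrt(2.0 * dt / dx) * noise

def simulate_path(seed, theta, size=8, n_steps=50, **step_kwargs):
    """Final field of one path; (seed, theta) fully determine the result."""
    rng = np.random.default_rng(seed)
    phi = np.zeros((size, size))          # assumed initial condition
    for _ in range(n_steps):
        noise = rng.standard_normal((size, size))
        phi = tdgl_step(phi, noise, theta, **step_kwargs)
    return phi

# Same random input, two nearby parameter values.
field_a = simulate_path(seed=0, theta=1.2)
field_b = simulate_path(seed=0, theta=1.35)
```

Because the seed fixes every normal draw, calling `simulate_path` twice with the same `seed` and `theta` reproduces the same field, which is the property the method relies on when the same random input is reused at different parameter values.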



2.2 Estimation problem 



To cover a broad range of estimation problems, we consider the estimation of
quantities related to a specific space-time point, quantities that are global (entire
lattice at a particular time) and quantities that depend on the entire time
evolution of the system. Specifically, we consider the following representative
quantities:

(P1) Point magnetization: $\phi(\mathbf{x}, t)$,

(P2) Total magnetization at a specific time $t$: $\sum_{\mathbf{x}} \phi(\mathbf{x}, t)$, and

(P3) Total space-time magnetization: $\sum_t \sum_{\mathbf{x}} \phi(\mathbf{x}, t)$.

The problem of estimating the expected value of any one of the above quantities
can be represented by:

$$J(\theta) = E[Y(\omega; \theta)]$$

where $\omega$ is a vector of random numbers representing all the noise/uncertainty
in a single complete path of the dynamics; $\theta$ is the temperature-related
parameter; $Y(\omega; \theta)$ is the random sample of a quantity of interest (e.g., the
magnetization from a single sample path), and $E[\cdot]$ denotes expectation. Note
that knowing the noise $\omega$ and parameter $\theta$ completely determines the path $\phi$
and the sample quantity of interest $Y(\omega; \theta)$.



2.3 The control variate technique 

Here we give a brief review of the classical control variate (CV) technique for 
variance reduction (see [3], [12]).

Let $Y = Y(\omega; \theta)$ and $J = E[Y]$. Assume $X_1, \ldots, X_k$ are random variables (called
control variates) that are correlated with $Y$ and assume their means
are known. Let $X = (X_1, \ldots, X_k)^T$, $E[X] = (E[X_1], \ldots, E[X_k])^T$, and
$\beta = (\beta_1, \ldots, \beta_k)^T$. Then $Z$, defined below, is a controlled estimator of $E[Y]$:

$$Z = Y + \sum_{i=1}^{k} \beta_i (X_i - E[X_i]) = Y + \beta^T (X - E[X])$$

The estimator $Z$ uses information included in samples of the controls (the
degree of their deviation from their known means) to "correct/adjust" the
estimator $Y$ and bring it closer to its unknown mean. This is the key idea of
CV. (Alternatively, $Z$ can be viewed through the linear regression of $Y$ on the
variables $X_1, \ldots, X_k$: $Z$ retains the part of the
variation in $Y$ that cannot be "explained" by the $X_i$'s.)

$Z$ is an unbiased estimator of $E[Y]$ for all vectors $\beta$. The coefficient vector
that minimizes the variance of $Z$ is:

$$\beta^\circ = -\Sigma_X^{-1} \Sigma_{XY}$$

where $\Sigma_X$ is the $k \times k$ covariance matrix of $X$ and $\Sigma_{XY}$ is the $k \times 1$ vector
of covariances of $Y$ and the $X_i$'s. When $\beta^\circ$ is used, the variance of $Z$ is given by
$(1 - R^2)\sigma_Y^2$, where $R^2 = \Sigma_{XY}^T \Sigma_X^{-1} \Sigma_{XY} / \sigma_Y^2$. Therefore,

$$\frac{\mathrm{Var}(Y)}{\mathrm{Var}(Z)} = \frac{1}{1 - R^2} \qquad (2)$$

and hence $(1 - R^2)^{-1}$ is precisely the theoretical degree of variance reduction
if the controlled estimator $Z = Y + \beta^{\circ T}(X - E[X])$ is used to estimate $J$ as
opposed to the crude MC estimator $Y$, and it is called the Variance Reduction
Ratio (VRR) statistic for control variates. Note that there is no upper limit to
the degree of achievable variance reduction since $R^2$ can potentially be very
close to 1 when the controls are highly correlated with the estimation variable
$Y$. In other words, the CV technique can potentially be very effective, leading
to orders of magnitude of variance reduction.
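The optimal coefficient and the resulting variance follow from a short calculation, sketched here using only the definitions above and assuming $\Sigma_X$ is invertible:

```latex
\begin{align*}
\mathrm{Var}(Z) &= \sigma_Y^2 + 2\,\beta^T \Sigma_{XY} + \beta^T \Sigma_X \beta, \\
\nabla_\beta \mathrm{Var}(Z) &= 2\,\Sigma_{XY} + 2\,\Sigma_X \beta = 0
  \;\Longrightarrow\; \beta^\circ = -\Sigma_X^{-1} \Sigma_{XY}, \\
\mathrm{Var}(Z)\big|_{\beta = \beta^\circ}
  &= \sigma_Y^2 - \Sigma_{XY}^T \Sigma_X^{-1} \Sigma_{XY}
   = (1 - R^2)\,\sigma_Y^2 .
\end{align*}
```

The minus sign in $\beta^\circ$ comes from writing the controlled estimator with a plus sign in front of $\beta^T (X - E[X])$.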

In practice, $\Sigma_X$ and $\Sigma_{XY}$ (i.e. $\beta^\circ$) are generally not known exactly and
need to be estimated from samples of the $X_i$'s and $Y$. Typically, $\beta^\circ$ is estimated
from the same samples used to construct the controlled estimator $Z$. While this
practice adds some bias for small sample sizes, and thus makes the effective
decrease in estimator mean squared error not precisely equal to the variance
reduction ratio $(1 - R^2)^{-1}$, this bias converges to zero faster than the standard
error of $Z$. Thus, expending computational resources on generating separate
pilot samples for estimating $\beta^\circ$ is not considered to be justifiable. For an
insightful and detailed discussion of the CV technique, see [3].
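The classical procedure reviewed in this subsection can be sketched in a few lines. The example below is only an illustration on synthetic data: the function name `cv_estimate` and the toy model (a standard normal control with known mean 0, and $Y = 2 + X + 0.3\,\varepsilon$) are assumptions, not part of the paper. As described above, $\beta^\circ$ is estimated from the same samples used to form $Z$.

```python
import numpy as np

def cv_estimate(y, x, x_mean):
    """Controlled estimate of E[Y] using controls X with known means.

    y: samples of Y, shape (n,); x: control samples, shape (n,) or (n, k);
    x_mean: known means of the controls, scalar or shape (k,).
    """
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float).reshape(len(y), -1)      # shape (n, k)
    x_mean = np.atleast_1d(np.asarray(x_mean, dtype=float))
    n = len(y)
    xc = x - x.mean(axis=0)
    yc = y - y.mean()
    sigma_x = xc.T @ xc / (n - 1)       # sample covariance matrix of X
    sigma_xy = xc.T @ yc / (n - 1)      # sample covariances of X and Y
    beta = -np.linalg.solve(sigma_x, sigma_xy)   # estimated optimal beta
    z = y + (x - x_mean) @ beta         # controlled samples
    r2 = sigma_xy @ np.linalg.solve(sigma_x, sigma_xy) / y.var(ddof=1)
    return z.mean(), beta, 1.0 / (1.0 - r2)      # estimate, beta, VRR

# Synthetic check: E[Y] = 2, and R^2 = 1 / 1.09 in the toy model.
rng = np.random.default_rng(0)
x = rng.standard_normal(10_000)
y = 2.0 + x + 0.3 * rng.standard_normal(10_000)
estimate, beta, vrr = cv_estimate(y, x, x_mean=0.0)
```

In this toy model the theoretical ratio is $1.09 / 0.09 \approx 12$, and the estimated coefficient should be close to $-1$, consistent with the sign convention of $Z$.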



2.4 Challenges in using the CV technique 



The critical task for using the CV technique is in finding effective controls. 
Once the controls are selected, the rest of the procedure is fairly routine. An 
effective control, say X, needs to satisfy two requirements (to simplify the 
discussion we consider a scalar control): 

(R1) X needs to be correlated with Y, and

(R2) E[X] needs to be available to the user, i.e., known. 

The main barrier to finding effective controls is the second requirement, namely
the requirement of a known mean $E[X]$. A modification of the CV technique
called Biased Control Variate (BCV) reduces the burden of requirement (R2)
by allowing for a good approximation of $E[X]$ when $E[X]$ cannot be evaluated
analytically [13]. While BCV lowers the requirement barrier and expands
the range of available choices for effective controls, it nonetheless limits its
potential scope by implicitly assuming an analytic path to arriving at the
approximate value. As we describe in the next section, in the DBMC approach
we turn the second requirement into a computational task; in other words, we
use statistical estimation to obtain a good estimate of $E[X]$. Therefore, barrier
(R2) is completely removed and the range of choices of controls is dramatically
expanded. The relevant question now becomes whether the computational
investment in estimating $E[X]$ pays enough dividends to make the investment
worthwhile.



3 DBMC & Control Variate 

The starting point of the DBMC approach is the observation that in many
parametric estimation settings, including in the example considered in this
paper, quantities $Y(\theta)$ and $Y(\theta')$ are highly correlated when the same random
input $\omega$ is used to generate them and when $\theta$ and $\theta'$ are close.^1 This suggests
using control variates $X_i = Y(\theta_i)$, $i = 1, \ldots, k$, when estimating $Y(\theta)$, where the
$\theta_i$'s are "close" to $\theta$.

While we have identified potentially effective controls, we do not have sufficient
information about them, i.e., $J(\theta_i) = E[X_i]$ is not known and needs to be
evaluated. This brings us to the second feature of the DBMC method, which
corresponds to its initial computational information gathering/setup stage.
This stage corresponds to statistical estimation of $J(\theta_i)$. Details are given
below.

3.1 DBMC + CV algorithm 

The DBMC approach consists of a setup stage and an estimation stage. 
3.1.1 Setup stage 

The DBMC setup phase involves generating a "large" number of input random
vectors $\omega$ and obtaining "high quality" estimates of $J(\theta_i)$. Let
$DB = \{\omega_1, \omega_2, \ldots, \omega_N\}$ ($N$ "large") denote a large set of random inputs. This set
represents the database. Given the database, the averages of the controls are
precisely calculated. A schematic of this stage is given in Figure 1.

^1 A similar observation is the basis for the histogram reweighting methods: "from
a simulation at a single state point (characterized in an Ising model by choice of
temperature T and magnetic field H) one does not gain information on properties
at that point only, but also in the neighboring region," ([1], page 116).



(1) For $j = 1, \ldots, N$

    (a) Generate $\omega_j$ according to the distribution of the inputs;

    (b) For $i = 1, \ldots, k$

        (i) Simulate the path $\phi(\omega_j; \theta_i)$

        (ii) Evaluate the value of the control $X_i(\omega_j) = Y(\omega_j; \theta_i)$

(2) For $i = 1, \ldots, k$

    (a) Find $J_{DB}(\theta_i)$, the average of the $i$th control on the database, as

        $$J_{DB}(\theta_i) = \frac{1}{N} \sum_{j=1}^{N} X_i(\omega_j)$$

Fig. 1. DBMC setup stage
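The setup stage can be sketched as follows. This is a minimal illustration, not the implementation used in the paper: `simulate_y` is a hypothetical stand-in for $Y(\omega; \theta)$ (the final state of a scalar double-well SDE rather than the TDGL lattice), the random input $\omega_j$ is identified with an integer seed, and the dictionary layout returned by `build_database` is an assumption.

```python
import numpy as np

def simulate_y(seed, theta, n_steps=200, dt=0.01):
    """Hypothetical stand-in for Y(omega; theta): final state of a scalar
    double-well SDE driven by the random stream identified by `seed`."""
    noise = np.random.default_rng(seed).standard_normal(n_steps)
    amp = np.sqrt(2.0 * dt)
    phi = 0.0
    for xi in noise:
        phi += dt * (theta * phi - phi**3) + amp * xi
    return phi

def build_database(thetas, N, base_seed=0):
    """Setup stage: evaluate every control on every input and average."""
    seeds = np.arange(base_seed, base_seed + N)        # one seed per omega_j
    controls = np.array([[simulate_y(int(s), th) for th in thetas]
                         for s in seeds])              # X_i(omega_j), shape (N, k)
    return {
        "seeds": seeds,
        "thetas": np.asarray(thetas, dtype=float),
        "controls": controls,
        "J_DB": controls.mean(axis=0),                 # J_DB(theta_i), shape (k,)
    }

db = build_database(thetas=[1.2, 1.35], N=50)
```

Storing the seeds alongside the control values is what later allows a path to be re-simulated at a new $\theta$ with exactly the same random input.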



3.1.2 Estimation stage 

To estimate $J(\theta)$ at a $\theta$ close to the $\theta_i$'s ($i = 1, \ldots, k$), select a "small" sample
(say of size $n \ll N$) uniformly from the database. For each sample $\omega_j$,
re-simulate the equation using $\omega_j$ and $\theta$ to obtain $Y(\omega_j; \theta)$. For these samples
the values of the controls $X_i(\omega_j)$ are available in the database. Using these,
evaluate a controlled estimate of $J(\theta)$. A schematic version of these steps is
given in Figure 2.



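(1) For $j = 1, \ldots, n$

    (a) Select $\omega_j$ uniformly from the database;

    (b) Simulate the path $\phi(\omega_j; \theta)$;

    (c) Evaluate the estimation variable $Y(\omega_j; \theta)$.

(2) Find the controlled estimator of $J(\theta)$:

    $$\hat{J}(\theta) = \frac{1}{n} \sum_{j=1}^{n} \Big[ Y(\omega_j; \theta) + \sum_{i=1}^{k} \beta_i \big( X_i(\omega_j) - J_{DB}(\theta_i) \big) \Big] \qquad (3)$$

Fig. 2. DBMC estimation stage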
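The estimation stage can be sketched in the same spirit. Everything named here is an assumption for illustration: `simulate_y` is a hypothetical scalar stand-in for $Y(\omega; \theta)$, inputs are identified with integer seeds, the database is deliberately small, and the coefficients are estimated by least squares from the resampled values.

```python
import numpy as np

def simulate_y(seed, theta, n_steps=200, dt=0.01):
    """Hypothetical stand-in for Y(omega; theta) (scalar double-well SDE)."""
    noise = np.random.default_rng(seed).standard_normal(n_steps)
    amp = np.sqrt(2.0 * dt)
    phi = 0.0
    for xi in noise:
        phi += dt * (theta * phi - phi**3) + amp * xi
    return phi

def dbmc_estimate(theta, seeds, controls, j_db, n, rng):
    """Resample n inputs, re-simulate at theta, and apply the controls."""
    idx = rng.integers(0, len(seeds), size=n)      # uniform, with replacement
    y = np.array([simulate_y(int(seeds[j]), theta) for j in idx])
    x = controls[idx]                              # stored X_i(omega_j)
    xc = x - x.mean(axis=0)
    beta = -np.linalg.solve(xc.T @ xc, xc.T @ (y - y.mean()))
    z = y + (x - j_db) @ beta                      # controlled samples
    return z.mean(), y.var(ddof=1), z.var(ddof=1)

# A small setup stage so that the example runs on its own.
thetas = np.array([1.2, 1.35])
seeds = np.arange(400)
controls = np.array([[simulate_y(int(s), th) for th in thetas] for s in seeds])
j_db = controls.mean(axis=0)

estimate, var_crude, var_controlled = dbmc_estimate(
    1.25, seeds, controls, j_db, n=100, rng=np.random.default_rng(123))
```

Only the path at the new $\theta$ is simulated during estimation; the control values are looked up. The ratio `var_crude / var_controlled` is a sample analogue of the variance reduction ratio for this toy model and says nothing about the TDGL results.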
3.2 Implementation choices

There are two general schemes for implementation of our CV approach: (I1),
corresponding to what is described above, requires storing simulation inputs
$\{\omega_j\}$ and outputs $\{X_i(\omega_j)\}$ in a database for later resampling; (I2) does not
utilize resampling, so there is no storage of data beyond recording the calculated
control means. Both implementations are feasible; the first is preferable
in most cases, while the second may be preferred in some cases. We elaborate below.

Implementation (I1).

• The database of random inputs, i.e., the $\omega_j$'s, is either directly stored or enough
information about them (e.g. input seeds of a pseudo-random number generator)
is stored to be able to regenerate the $\omega_j$'s precisely.

• The $k$ paths corresponding to $\theta_i$, $i = 1, \ldots, k$, i.e., $\phi(\omega_j; \theta_i)$, are generally
simulated "in parallel" as elements of a random vector, $\omega_j$, are progressively
generated.

• For each random input, say $\omega_j$, the values of the controls, $X_i(\omega_j)$,
$j = 1, \ldots, N$, $i = 1, \ldots, k$, are stored.

Implementation (I2).

• Once the setup stage is completed, the only values stored are the "high
quality" estimates of the $J(\theta_i)$'s, i.e., the $k$ values $J_{DB}(\theta_i)$, $i = 1, \ldots, k$.

• At the estimation phase, $n$ random input vectors $\omega_j$, $j = 1, \ldots, n$, are
generated anew; paths at $\theta$ and $\theta_i$, $i = 1, \ldots, k$, are simulated using the new
random inputs and for each path $X_i(\omega_j)$ and $Y(\omega_j; \theta)$ are calculated; finally,
using these values, the controlled estimator is evaluated.
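For implementation (I1), the option of storing seeds instead of the inputs themselves can be illustrated with a short sketch. The helper `regenerate_input`, the keying of each input by a pair `(base_seed, j)`, and the array shape are assumptions for the example; any reproducible pseudo-random generator would serve the same purpose.

```python
import numpy as np

def regenerate_input(base_seed, j, shape):
    """Rebuild the j-th random input omega_j from the pair (base_seed, j)."""
    return np.random.default_rng([base_seed, j]).standard_normal(shape)

# Hypothetical input size: 50 time steps on an 8 x 8 lattice.
base_seed, shape = 2024, (50, 8, 8)

omega_first = regenerate_input(base_seed, 7, shape)
omega_again = regenerate_input(base_seed, 7, shape)
identical = np.array_equal(omega_first, omega_again)   # exact regeneration

numbers_stored_with_seed = 2                 # (base_seed, j) identifies omega_j
numbers_stored_directly = omega_first.size   # every normal draw of the path
```

The trade-off is storage against recomputation: with seeds, each database entry needs only the key and the $k$ control values, at the price of regenerating the draws whenever a path is re-simulated.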



3.3 Statistical properties & computational efficiency 

The promise of the approach is the following: by anchoring estimation via CV
at a few high quality estimates (at $\theta_1, \ldots, \theta_k$), it is possible to obtain high
quality estimates at other locations in the parameter space (at other $\theta$) with
far fewer samples. The actual statistical properties of the resulting estimators,
and the computational efficiency of generating them, reflect choices made in
implementing each given problem. For example, how much computation should
be "invested" in the exploration phase, and which points $\theta_i$ in the parameter
space should be explored are two important questions that need further investigation.
Such choices generally involve problem dependent tradeoffs, and we
leave them to future studies.
Instead, the analysis that follows is meant to provide a general and qualitative
understanding of the statistical properties, computational efficiency and the
tradeoffs involved. The discussion is as general as possible, but consistent
with the numerical study described in section 4, where such implementation
choices were made utilizing only a basic familiarity with the problem. For
further discussion, see [8].



3.3.1 Statistical properties 

We give the analysis for implementation (I1). In other words, assume we are
re-sampling from the database. Analysis of implementation (I2) shows similar
estimator statistical properties.

To simplify the discussion consider a single control, say $X_1$. Let $J(\theta) = E[Y]$,
$J(\theta_1) = E[X_1]$, $\sigma_Y^2 = \mathrm{Var}(Y)$, $\sigma_{X_1}^2 = \mathrm{Var}(X_1)$. Assume a database of input
variables is generated and let $Y^*$ and $X_1^*$ be random variables corresponding
to $Y$ and $X_1$ that are generated by re-sampling (uniformly, with replacement)
from the database. Let $J^*(\theta)$, $\sigma_{Y^*}^2$, $J^*(\theta_1)$, $\sigma_{X_1^*}^2$ denote the means and variances
of the re-sampled variables $Y^*$ and $X_1^*$.

Conditioned on the database, the controlled estimator is exactly the classical
CV estimator and all results from classical CV apply. For example, for any
scalar $\beta$, $Z^* = Y^* + \beta(X_1^* - J^*(\theta_1))$ is an unbiased estimator of $J^*(\theta)$, $J^*(\theta_1)$
is known, and the optimal $\beta^\circ$ is what is prescribed by classical CV if we take
all random variables as those defined on the database. A measure of variance
reduction due to using a controlled estimator is

$$\mathrm{VRR} = \frac{\mathrm{Var}(Y^*)}{\mathrm{Var}(Z^*)} \qquad (4)$$



We use the controlled estimator $Z^*$ as an estimator for $J(\theta)$. Assume optimal
$\beta^\circ$ is used to define $Z^*$ and assume $E[Z^*] = J^*(\theta)$.^2 In general $J^*(\theta) \neq J(\theta)$.
Therefore, $Z^*$ is a biased estimator of $J(\theta)$ where the bias is introduced by
sampling from the database, i.e., from $Y^*$, as opposed to from $Y$. We have some
probabilistic assessment of this bias and we can reduce it by increasing the
size of the database. Specifically, for this bias we can obtain an approximate
$1 - \alpha$ probability confidence interval:

$$P\left( |J^*(\theta) - J(\theta)| < z_{\alpha/2} \frac{\sigma_Y}{\sqrt{N}} \right) \approx 1 - \alpha$$

where $z_{\alpha/2}$ is the $1 - \alpha/2$ quantile of the standard normal distribution. In
other words, with high probability the bias is of the order of $O(N^{-1/2})$. We
assume that for large $N$ the bias is sufficiently small to be disregarded and
that we can focus on VRR in (4) as the key measure of computational gain
in using the controlled estimator to estimate $J(\theta)$.

^2 i.e. we ignore the low order bias that results from the typical CV procedure of
estimating the optimal $\beta^\circ$ (e.g. [3]), not to be confused with the resampling bias
discussed in this section.
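To give a sense of scale for the resampling bias, the half-width of the interval can be evaluated for the database size $N = 2^{14}$ used in the numerical study of section 4. The confidence level $\alpha = 0.05$ (so $z_{\alpha/2} \approx 1.96$) is an illustrative choice, not one made in the paper.

```latex
\[
z_{\alpha/2}\,\frac{\sigma_Y}{\sqrt{N}}
  = 1.96 \cdot \frac{\sigma_Y}{\sqrt{16384}}
  = \frac{1.96}{128}\,\sigma_Y
  \approx 0.0153\,\sigma_Y ,
\qquad
\frac{\sigma_Y}{\sqrt{n}} = \frac{\sigma_Y}{\sqrt{256}} = 0.0625\,\sigma_Y .
\]
```

The second expression is the standard error of a crude estimator based on $n = 256$ samples, shown for comparison.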

3.3.2 Computational efficiency 

Generating the above large database, as we pointed out earlier, corresponds
to an initial "setup" cost. Let $C$ be the computational cost of generating a
sample of $Y(\theta)$. This cost involves generating an $\omega$, simulating the path, and
evaluating $Y(\omega; \theta)$. A reasonable assumption for many problems is that this
cost is about the same for all $\theta$ and $\omega$. Then the setup cost of generating the
database and obtaining averages of the controls is approximately $N \times k \times C$. Let
$\mathrm{VRR}(\theta)$ denote the variance reduction ratio at $\theta$, i.e., the ratio of the variance
of an uncontrolled sample to that of a controlled sample at $\theta$. Then, the
statistical error of a controlled estimator based on $n$ samples is approximately
the same as that of $n \times \mathrm{VRR}(\theta)$ samples of an uncontrolled estimator. Thus,
the ratio of the computational costs of the two estimators (to arrive at the
same statistical accuracy) is $(n \times \mathrm{VRR}(\theta) \times C)/(n \times C) = \mathrm{VRR}(\theta)$. Therefore,
$\mathrm{VRR}(\cdot)$ can serve as a measure of benefit of the DBMC approach.

The setup cost of the DBMC approach can be justified in two types of ap- 
plications. The first type are those applications that require solving many 
instances of the estimation problem, at many $\theta$'s. If the total number of instances
is sufficiently large, and some variance reduction is achieved on the
average on those instances, then the large fixed set-up cost can be dwarfed by 
the total computational savings from the many estimations. The second type 
are real-time applications where the setup cost can be viewed as an off-line 
cost enabling significant efficiency gains in the critical task of real-time esti- 
mation. Typically, the "cost" of delay in such real-time estimation is higher 
and not merely computational, justifying even a much larger computational 
effort off-line. 
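A rough break-even count makes the first justification concrete. The sketch assumes implementation (I1), so that each controlled sample costs about $C$ (only the path at $\theta$ is re-simulated), ignores database access costs, and assumes the same VRR at every instance. The symbol $m$ for the number of estimation instances is introduced here for illustration.

```latex
\begin{align*}
\text{setup cost} &\approx N k C, \\
\text{saving per instance} &\approx n\,\mathrm{VRR}\,C - n\,C = n(\mathrm{VRR} - 1)\,C, \\
m\,n(\mathrm{VRR} - 1)\,C \ge N k C
  &\;\Longleftrightarrow\; m \ge \frac{N k}{n(\mathrm{VRR} - 1)} .
\end{align*}
```

With the values of the numerical study in section 4 ($N = 16384$, $n = 256$) and two controls, $Nk/n = 128$, so a ratio of 129 (the smallest CV2C entry in Table 1) gives $m \ge 1$. With a single control ($k = 1$, $Nk/n = 64$) and a ratio of 5, the count becomes $m \ge 16$.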



4 Numerical results 

The numerical results in this section are intended to give a qualitative illustra- 
tion of the efficiency gains that can be achieved using the DBMC approach. 
Specifically, we estimate the variance reduction that can be achieved over 
regular (crude) sampling, when estimating the three quantities of interest (a 
point magnetization, total magnetization at a specific time t and the total 
time-space magnetization) at a range of the parameter $\theta$. Our choices of the
size of the database, number of samples used for estimation, range of parameter
values, and the controls are simply for illustration purposes. However, we
expect that the numerical results are, qualitatively, quite representative.

We simulate the TDGL dynamics on a 40 x 40 lattice (lattice spacing $\delta x_k = 1$,
$k = 1, 2$) with fixed $\chi = 1$. On each path, we evolve the system for a total
of 5000 time steps ($\delta t = 0.01$), which is sufficient for the system to exhibit
behavior that is specific to its temperature region. The critical point for this
system is $\theta_c = 1.265$ [11], and our parameter range of interest (1.0 to 1.5)
extends to both sides of that critical point.

To build a database, we simulate $N = 2^{14} = 16384$ paths and evaluate point
magnetization, total magnetization at a specific time, and total space-time
magnetization at two nominal values of $\theta$, 1.2 and 1.35.

For each quantity of interest, we consider three control variate estimators. The
first two estimators, CV1.2 and CV1.35, use single controls corresponding to
$\theta = 1.2$ and $\theta = 1.35$, respectively. We chose to anchor our estimators at
those two nominal values for $\theta$ because they are located on opposite sides of
the phase transition line $\theta_c$. The third estimator, CV2C, uses both controls
simultaneously.

We use $n = 2^8 = 256$ samples for crude and CV estimators. To estimate the
variance of these estimators, following the micro-macro simulation approach
(see, e.g., [8]), we use 40 independent macro simulations, each consisting of 256
independent micro simulations. We obtain variance estimates from each macro
simulation and average the resulting 40 values to obtain an overall variance
estimate. We report the ratios of the variance estimates (crude/controlled, as
in (4)) as VRR. A sampling of VRR results for the total space-time magnetization
(problem P3) is given in Table 1 and the corresponding graph is given
in Figure 3. The graph for point magnetization (problem P1) is given in Fig.
4, and the results for the total magnetization at a time $t$ (problem P2) are
quite similar and are excluded.
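One plausible reading of the micro-macro procedure is sketched below on synthetic data. The toy model (a standard normal control with known mean 0 and $Y = 2 + X + 0.3\,\varepsilon$), the function name `macro_variances`, and the use of per-sample variances within each macro simulation are assumptions for illustration; only the counts of 40 macro and 256 micro simulations come from the text.

```python
import numpy as np

def macro_variances(rng, n=256):
    """One macro simulation: n micro samples, crude and controlled variances."""
    x = rng.standard_normal(n)                    # control with known mean 0
    y = 2.0 + x + 0.3 * rng.standard_normal(n)    # toy estimation variable
    xc, yc = x - x.mean(), y - y.mean()
    beta = -(xc @ yc) / (xc @ xc)                 # estimated optimal coefficient
    z = y + beta * (x - 0.0)                      # controlled samples
    return y.var(ddof=1), z.var(ddof=1)

rng = np.random.default_rng(1)
# Average the variance estimates over 40 independent macro simulations.
var_crude, var_controlled = np.mean(
    [macro_variances(rng) for _ in range(40)], axis=0)
vrr = var_crude / var_controlled                  # crude / controlled
```

For this toy model the theoretical ratio is $1.09 / 0.09 \approx 12$, so `vrr` should land near that value; it has no bearing on the TDGL figures reported in Table 1.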

Table 1

Variance reduction ratios of the estimators applied to the space-time integral of the
magnetization, $\sum_{\mathbf{x}} \sum_t \phi(\mathbf{x}, t)$, at several values of $\theta$.

Estimator \ θ   1.150  1.175  1.225  1.250  1.265  1.300  1.325  1.375  1.400
CV1.2              63    236    219     55     33     15     10      6      5
CV1.35              5      6     11     16     21     59    231    245     67
CV2C              170    709    947    332    259    300    761    513    129

Fig. 3. Variance reduction ratios of the estimators of the space-time integral of the
magnetization, $\sum_t \sum_{\mathbf{x}} \phi(\mathbf{x}, t)$, over a range of values of $\theta$ (log scale).

Based on these results, we draw the following conclusions:

• Controlled estimators produce dramatic variance reduction for parameter
values very close to the nominal parameters and substantial variance reduction
at values moderately close to the nominal.

• For all the estimation problems, adding the second control consistently im- 
proves performance, in some cases leading to substantial reduction in vari- 
ance (compared to single controls). Of course, by incorporating information 
from points on both sides of the critical temperature, CV2C is expected to 
give better coverage than either of the single control estimators. However, 
CV2C does better than either of the single control estimators even in their 
own regions, which suggests that each control provides relevant information 
to the estimation problem in the opposite region. 

• VRR values for the total space-time magnetization are somewhat larger 
than those for the point and total magnetization at a specific time t - we 
expect this to be true more generally for path integrals when compared with 
values at specific time instances. 
5 Conclusions



In this paper we described a new strategy, DataBase Monte Carlo (DBMC), 
for improving computational efficiency of Ensemble Monte Carlo. For a specific 
time-dependent nonlinear dynamics we showed that the approach can lead to 
significant efficiency gains for a range of estimation problems. Our selection 
of the controls has been ad-hoc and for illustration purposes. Further work 
is required to better understand the options available and the computational 
tradeoffs involved. To this end, our current research is focused on (i) derivation 
of more specific guidelines for the selection of effective control variates, (ii)
implementation of the DBMC strategy in conjunction with other variance 
reduction techniques, for example, stratification and importance sampling, 
and (iii) application of the method in some specific domains, for example, 
estimation problems in geophysical fluids and biochemical systems. 